Indexing and Selection

Operation	Syntax	Result
Select column	df[col]	Series
Select row by label	df.loc[label]	Series
Select row by integer	df.iloc[loc]	Series
Select rows	df[start:stop]	DataFrame
Select rows with boolean mask	df[mask]	DataFrame

documentation: http://pandas.pydata.org/pandas-docs/stable/indexing.html



In [ ]:

    
import pandas as pd
import numpy as np



In [ ]:

    
produce_dict = {'veggies': ['potatoes', 'onions', 'peppers', 'carrots'],'fruits': ['apples', 'bananas', 'pineapple', 'berries']}
produce_df = pd.DataFrame(produce_dict)
produce_df

select a single column using a dictionary-like string key (returns a Series)
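One way to fill in the cell for this step is sketched below. It rebuilds `produce_df` so the snippet runs on its own; the variable name `veggies` is only illustrative.

```python
import pandas as pd

produce_dict = {'veggies': ['potatoes', 'onions', 'peppers', 'carrots'],
                'fruits': ['apples', 'bananas', 'pineapple', 'berries']}
produce_df = pd.DataFrame(produce_dict)

# A single string key works like a dictionary lookup and returns one column
veggies = produce_df['veggies']
print(type(veggies).__name__)  # Series
print(veggies)
```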



In [ ]:

select several columns using a list of strings as the key (note: double square brackets; returns a DataFrame)
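A minimal sketch of this selection, assuming the same `produce_df` as above (rebuilt here so the snippet is self-contained). The outer brackets are the indexing operator; the inner brackets are an ordinary Python list of column names.

```python
import pandas as pd

produce_df = pd.DataFrame({
    'veggies': ['potatoes', 'onions', 'peppers', 'carrots'],
    'fruits': ['apples', 'bananas', 'pineapple', 'berries'],
})

# Passing a list of column names returns a DataFrame,
# with columns in the order given in the list
both = produce_df[['fruits', 'veggies']]
print(type(both).__name__)  # DataFrame
print(both)

# Even a one-element list returns a DataFrame, not a Series
only_fruits = produce_df[['fruits']]
```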



In [ ]:

select a row by integer position using .iloc (returns a Series)
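A possible cell for this step, again rebuilding `produce_df` so it runs standalone. The name `row` is illustrative.

```python
import pandas as pd

produce_df = pd.DataFrame({
    'veggies': ['potatoes', 'onions', 'peppers', 'carrots'],
    'fruits': ['apples', 'bananas', 'pineapple', 'berries'],
})

# .iloc selects by integer position; position 2 is the third row
row = produce_df.iloc[2]
print(row)

# The resulting Series is indexed by the column names
print(row['veggies'])  # peppers
```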



In [ ]:

select rows using an integer slice (the stop position is excluded; returns a DataFrame)
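The two empty cells that follow could hold something like the sketch below, which shows slicing with plain brackets and the equivalent, more explicit `.iloc` form. It assumes the default integer index that `produce_df` gets on construction.

```python
import pandas as pd

produce_df = pd.DataFrame({
    'veggies': ['potatoes', 'onions', 'peppers', 'carrots'],
    'fruits': ['apples', 'bananas', 'pineapple', 'berries'],
})

# A slice inside plain brackets selects rows, not columns
middle = produce_df[1:3]
print(middle)

# .iloc makes the positional intent explicit and gives the same rows
middle_iloc = produce_df.iloc[1:3]
```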



In [ ]:



In [ ]:

for string columns, + is overloaded as the concatenation operator
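As a small illustration, adding the two string columns of `produce_df` joins them element by element. The separator `' and '` and the name `combined` are choices made for this sketch only.

```python
import pandas as pd

produce_df = pd.DataFrame({
    'veggies': ['potatoes', 'onions', 'peppers', 'carrots'],
    'fruits': ['apples', 'bananas', 'pineapple', 'berries'],
})

# + on string columns concatenates row by row; a plain string is broadcast
combined = produce_df['veggies'] + ' and ' + produce_df['fruits']
print(combined)
```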



In [ ]:

Data alignment and arithmetic

Arithmetic between DataFrame objects automatically aligns on both the columns and the index (row labels).

Note the locations of NaN: they appear wherever a row or column label is present in only one of the two operands.



In [ ]:

    
df = pd.DataFrame(np.random.randn(10, 4), columns=['A', 'B', 'C', 'D'])
df2 = pd.DataFrame(np.random.randn(7, 3), columns=['A', 'B', 'C'])
sum_df = df + df2
sum_df
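Because the cell above uses random values, the NaN pattern can be easier to see in a tiny deterministic case. The frames `left` and `right` below are made up for this sketch and are not part of the original notebook.

```python
import pandas as pd

left = pd.DataFrame({'A': [1.0, 2.0, 3.0], 'B': [10.0, 20.0, 30.0]})
right = pd.DataFrame({'A': [100.0, 200.0]})  # no column B, no row 2

# Labels are matched first, then values are added.
# Row 2 and column B exist only in `left`, so those cells become NaN.
total = left + right
print(total)
```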

Boolean indexing



In [ ]:



In [ ]:

first, build a boolean mask that is True for the rows whose value in column B is less than zero

then, use the mask to select those rows, keeping all columns in the resulting data set
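The two steps can be sketched as follows. A seeded NumPy generator replaces `np.random.randn` here so the result is reproducible; that substitution, and the names `mask` and `negative_b`, are assumptions of this example.

```python
import numpy as np
import pandas as pd

rng = np.random.default_rng(0)
df = pd.DataFrame(rng.standard_normal((10, 4)), columns=['A', 'B', 'C', 'D'])

# Step 1: a boolean Series, one True/False per row
mask = df['B'] < 0
print(mask)

# Step 2: indexing with the mask keeps only the True rows, with all columns
negative_b = df[mask]
print(negative_b)
```

The two steps are often written in one line as `df[df['B'] < 0]`.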



In [ ]:



In [ ]:

isin method: test whether each value is contained in a given list of values
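A minimal sketch using `produce_df`: `isin` returns a boolean mask, which can then be used for boolean indexing. The list `wanted` is a made-up example.

```python
import pandas as pd

produce_df = pd.DataFrame({
    'veggies': ['potatoes', 'onions', 'peppers', 'carrots'],
    'fruits': ['apples', 'bananas', 'pineapple', 'berries'],
})

wanted = ['apples', 'berries']

# True where the value in 'fruits' appears in the list
fruit_mask = produce_df['fruits'].isin(wanted)
print(fruit_mask)

# Use the mask to keep only the matching rows
matches = produce_df[fruit_mask]
print(matches)
```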



In [ ]:

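where method: keep values where a condition is True and replace the others (with NaN by default), preserving the original shape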
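Unlike boolean indexing, `where` does not drop rows. The small frame `scores` below is invented for this sketch so the output is easy to check by eye.

```python
import pandas as pd

scores = pd.DataFrame({'A': [1.5, -0.5, 2.0], 'B': [-1.0, 0.25, -3.0]})

# Cells where the condition is False become NaN; the shape is unchanged
positive_only = scores.where(scores > 0)
print(positive_only)

# A second argument supplies the replacement value instead of NaN
clipped = scores.where(scores > 0, 0.0)
print(clipped)
```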



In [ ]: